

Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels

Shahrampour, Shahin, Tarokh, Vahid

Neural Information Processing Systems

Nonlinear kernels can be approximated using finite-dimensional feature maps for efficient risk minimization. Due to the inherent trade-off between the dimension of the (mapped) feature space and the approximation accuracy, the key problem is to identify promising (explicit) features that lead to satisfactory out-of-sample performance. In this work, we tackle this problem by efficiently choosing such features from multiple kernels in a greedy fashion. Our method sequentially selects these explicit features from a set of candidate features using a correlation metric. We establish an out-of-sample error bound capturing the trade-off between the error in terms of explicit features (approximation error) and the error due to spectral properties of the best model in the Hilbert space associated with the combined kernel (spectral error). The result verifies that when the (best) underlying data model is sparse enough, i.e., the spectral error is negligible, one can control the test error with a small number of explicit features, which can scale poly-logarithmically with the number of data points. Our empirical results show that, given a fixed number of explicit features, the method achieves a lower test error at a smaller time cost than the state of the art in data-dependent random features.
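The abstract describes selecting explicit features from a candidate pool via a correlation metric. As a rough illustration (not the authors' implementation), the sketch below runs an orthogonal-matching-pursuit-style loop over random Fourier features drawn from several Gaussian kernels; the data, bandwidths, and iteration count are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical stand-in for a real dataset).
n, d = 200, 5
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Candidate explicit features: random Fourier features drawn from
# several Gaussian kernels with different bandwidths ("multiple kernels").
bandwidths = [0.5, 1.0, 2.0]
M = 300  # candidates per kernel (arbitrary)
W = np.concatenate([rng.normal(scale=1.0 / s, size=(M, d)) for s in bandwidths])
b = rng.uniform(0, 2 * np.pi, size=W.shape[0])
Phi = np.cos(X @ W.T + b)  # (n, number of candidate features)

# Greedy selection: repeatedly pick the candidate most correlated with
# the current residual, then refit least squares on the selected columns.
selected = []
residual = y.copy()
for _ in range(20):
    corr = np.abs(Phi.T @ residual) / np.linalg.norm(Phi, axis=0)
    if selected:
        corr[selected] = -np.inf  # never reselect a feature
    j = int(np.argmax(corr))
    selected.append(j)
    S = Phi[:, selected]
    coef, *_ = np.linalg.lstsq(S, y, rcond=None)
    residual = y - S @ coef

train_mse = np.mean(residual ** 2)
```

Because the selected columns are refit jointly at every step, the training error is non-increasing in the number of selected features, which mirrors the dimension/accuracy trade-off the abstract discusses.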


Reviews: Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels

Neural Information Processing Systems

In particular, [1, Algorithm 3] proposes an approach for minimizing the expected loss of a linear predictor that aims at finding a 'good' sparse solution. The main idea of the algorithm from [1] is to iteratively add features by picking a previously unselected feature that yields the largest reduction in the expected risk. Then, a linear model is trained using the extended feature representation, and the whole process is repeated. The authors use essentially the same idea: they take a large dictionary of features to represent the data and then run Algorithm 3 from [1] to pick 'informative' features and generate a sparse feature representation.
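The forward-greedy loop the reviewer describes (add the unselected feature whose inclusion most reduces the empirical risk, refit, repeat) can be sketched as follows; this is an illustrative toy with made-up data, not the actual Algorithm 3 of [1].

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data and a dictionary of candidate features (columns of Phi).
n, p = 150, 60
Phi = rng.normal(size=(n, p))
y = Phi[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

def risk(cols):
    """Empirical squared-error risk of a least-squares fit on `cols`."""
    if not cols:
        return float(np.mean(y ** 2))
    S = Phi[:, cols]
    coef, *_ = np.linalg.lstsq(S, y, rcond=None)
    return float(np.mean((y - S @ coef) ** 2))

# Forward greedy loop: at each step add the unselected feature whose
# inclusion yields the largest drop in empirical risk, then refit.
selected, risks = [], [risk([])]
for _ in range(5):
    best = min((j for j in range(p) if j not in selected),
               key=lambda j: risk(selected + [j]))
    selected.append(best)
    risks.append(risk(selected))
```

Since the fitted subspaces are nested, the recorded training risks can only go down as features are added; controlling the *expected* (out-of-sample) risk is exactly what the paper's learning bounds address.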


Feasibility of random basis function approximators for modeling and control

Tyukin, Ivan, Prokhorov, Danil

arXiv.org Artificial Intelligence

We discuss the role of random basis function approximators in modeling and control. We analyze the published work on random basis function approximators and demonstrate that their favorable error rate of convergence O(1/n) is guaranteed only with very substantial computational resources. We also discuss implications of our analysis for applications of neural networks in modeling and control.

I. INTRODUCTION

Efficient modeling and control of complex systems in the presence of uncertainties is important for modern engineering. This is especially true in the domain of intelligent systems that are designed to operate in uncertain environments. Physical models of such relations f(·) are not always available, and it is quite common to use mathematical substitutes such as, e.g., superpositions of (basis) functions that are capable of approximating an a-priori unknown f(·) with the required precision.
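To make the O(1/n) rate discussed here concrete, the sketch below approximates a Gaussian kernel with random Fourier features, a standard random-basis construction (this is a generic illustration, not the paper's specific setup): with n random bases the mean squared approximation error decays roughly like 1/n, so the averaged error for a large n should sit well below that for a small n.

```python
import numpy as np

rng = np.random.default_rng(2)

# Target: the Gaussian kernel k(x, z) = exp(-||x - z||^2 / 2),
# approximated by an inner product of n random Fourier features.
d = 3
x = rng.normal(size=d)
z = rng.normal(size=d)
k_true = float(np.exp(-np.sum((x - z) ** 2) / 2))

def rff_estimate(n):
    """Unbiased n-basis Monte Carlo estimate of k(x, z)."""
    W = rng.normal(size=(n, d))                   # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=n)         # random phases
    phi = lambda v: np.sqrt(2.0 / n) * np.cos(W @ v + b)
    return float(phi(x) @ phi(z))

def avg_sq_error(n, trials=200):
    """Average squared error over independent random bases."""
    return float(np.mean([(rff_estimate(n) - k_true) ** 2
                          for _ in range(trials)]))

mse = {n: avg_sq_error(n) for n in (50, 2000)}
```

The catch the authors analyze is visible even here: halving the error requires quadrupling n (in squared-error terms, the rate is paid for with proportional computational resources), which is why the favorable rate can be expensive to realize in practice.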